Data Preparation Framework for Preprocessing Clinical Data in Data Mining
Abstract
Electronic health records are designed to provide online transactional data recording and reporting services that support the health care process. The characteristics of clinical data as it originates during clinical documentation, including issues of data availability and complex representation models, can make data mining applications challenging. Data preprocessing and transformation are therefore required before data mining can be applied to clinical data. This article describes an approach to data preparation that draws on information from the data, the metadata, and sources of medical knowledge. Heuristic rules and policies are defined for each of these three types of supporting information. Compared with an entirely manual process for data preparation, this approach can potentially reduce manual work by achieving a degree of automation in rule creation and execution. A pilot experiment demonstrates that data sets created through this approach lead to better model learning results than a fully manual process.
Related articles
Enhancing Learning from Imbalanced Classes via Data Preprocessing: A Data-Driven Application in Metabolomics Data Mining
This paper presents a data mining application in metabolomics. It aims to build an enhanced machine learning classifier that can be used for diagnosing cachexia syndrome and identifying the biomarkers involved. To achieve this goal, a data-driven analysis is carried out using a public dataset consisting of 1H-NMR metabolite profiles. This dataset suffers from the problem of imbalanced classes...
GEOARM: an Interoperable Framework to Improve Geographic Data Preprocessing and Spatial Association Rule Mining
Geographic data preprocessing is the most expensive and effort-consuming step in the knowledge discovery process, but it has received little attention in the literature. For the data mining step, especially for association rule mining, many different algorithms have been proposed. Their main drawback, however, is the huge number of generated rules, most of which are well-known patterns. This paper...
Evaluation of Data Mining Algorithms for Detection of Liver Disease
Background and Aim: The liver, as one of the largest internal organs in the body, is responsible for many vital functions, including purifying blood, regulating the body's hormones, and preserving glucose. Therefore, disruptions to these functions can sometimes be irreparable. Early prediction of these diseases will help their early and effective treatm...
Identification of the most important factors of ethnic differences in anthropometric dimensions of Iranian workers using the decision tree
Background and aims: Anthropometry is the branch of human science concerned with the physical measurement of the human body, especially its size and shape. One application of anthropometric data in ergonomics is the design of working spaces and the development of industrialized products. So that the tools, equipment and workstations, which are designed based on the physical dimensions of the workers, ...
An Optimal Model for Medicine Preparation Using Data Mining
Introduction: Lack of financial resources and liquidity are the main problems of hospitals. Pharmacies are one of the sectors that affect hospital turnover. Because the use and supply of medicines are not forecast, pharmacies encounter over-inventory, large volumes of expired medicines, and sometimes shortages of medicines at the end of the year. Therefore, medicine prediction using availabl...
Journal title:
- AMIA ... Annual Symposium proceedings. AMIA Symposium
Publication date: 2006